Exploring Predicate-Argument Relations for Named Entity Recognition in the Molecular Biology Domain
Abstract
In this paper, the semantic relationships between a predicate and its arguments, expressed as semantic roles, are employed to improve lexical-based named entity recognition (NER) in the molecular biology domain. The semantic roles were realized as various sets of syntactic features for a machine learning model, in order to determine which representation of this knowledge benefits NER the most. The empirical results show that the best feature set consists of the predicate’s surface form, the predicate’s lemma, voice, and a combined feature of the subject-object head’s lemma and the transitive-intransitive sense. The performance improvement from using these features indicates the advantage of predicate-argument semantic knowledge for NER. There is still room to enhance NER with this semantic knowledge, e.g. by employing semantic roles other than agent and theme and by extending the rules for efficient identification of an argument’s boundary.
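The best-performing feature set named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PredicateInfo` container, the feature names, the `|` separator, and the rule that a predicate with an object head counts as transitive are all assumptions made for illustration, since the abstract does not say how the features were encoded.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PredicateInfo:
    """Hypothetical container for one predicate and its two core arguments."""
    surface: str                               # predicate as it appears in the text
    lemma: str                                 # base form of the predicate
    voice: str                                 # "active" or "passive"
    subject_head_lemma: Optional[str] = None   # lemma of the subject's head word
    object_head_lemma: Optional[str] = None    # lemma of the object's head word


def predicate_features(pred: PredicateInfo) -> Dict[str, str]:
    """Build the four feature groups listed in the abstract for one predicate."""
    # Assumption: a predicate that has an object head is treated as transitive.
    sense = "transitive" if pred.object_head_lemma else "intransitive"
    subj = pred.subject_head_lemma or "NONE"
    obj = pred.object_head_lemma or "NONE"
    return {
        "pred_surface": pred.surface,
        "pred_lemma": pred.lemma,
        "voice": pred.voice,
        # Combined feature: subject/object head lemmas joined with the sense.
        "args_sense": f"{subj}|{obj}|{sense}",
    }


# Illustrative input, not taken from the paper's data.
example = PredicateInfo("activates", "activate", "active", "protein", "gene")
features = predicate_features(example)
```

A dictionary like `features` could then be attached to each token governed by the predicate and passed to whichever sequence-labelling model is in use. Joining the argument heads and the sense into a single string lets a model treat the combination as one feature value rather than three independent ones.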
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Exploring Semantic Roles for Named Entity Recognition in the Molecular Biology Domain
A Named Entity Recognition and Classification System for Persian Texts Based on a Neural Network
Named Entity Recognition (NER) is a fundamental task in natural language processing and is also known as a subset of information extraction. We seek to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of times, etc. Named Entity Recognition for English texts has been researched widely in past years, howev...
An Eye-tracking Study of Named Entity Annotation
Utilising effective features in machine learning-based natural language processing (NLP) is crucial in achieving good performance for a given NLP task. The paper describes a pilot study on the analysis of eye-tracking data during named entity (NE) annotation, aiming at obtaining insights into effective features for the NE recognition task. The eye gaze data were collected from 10 annotators and...
Annotation of Predicate-argument Structure on Molecular Biology Text
Annotated corpora are essential resources for natural language processing. This paper describes our approach for building a corpus annotated with predicate-argument structure on research abstracts in the molecular biology domain. Observation of the records in a database of cell signaling events and corresponding research abstracts showed that extracting predicate-argument structure is a useful interm...